The Forward-Forward Algorithm: Some Preliminary Investigations

#learning_algorithm #gradient-free_optimization Forward gradient learning #FFA #paper #Forward-Forward_algorithm

Proposes the Forward-Forward (FF) algorithm as a way to train neural networks in place of backpropagation.

web link

The Forward-Forward Algorithm: Some Preliminary Investiga... - Google Scholar

code:citation

Hinton, Geoffrey E. “The Forward-Forward Algorithm: Some Preliminary Investigations.” ArXiv abs/2212.13345 (2022): n. pag.

Other

Japanese explanatory slides

【DL輪読会】The Forward-Forward Algorithm: Some Preliminary

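Twitter post 1

The FF algorithm is a learning procedure that needs no backpropagation, derived from ideas in Boltzmann machines and Noise Contrastive Estimation. For large models, though, networks trained with backpropagation still do better.

Two advantages of dropping backpropagation are mentioned: it makes it easier to study how NNs relate to neurons as neuroscience sees them, and with suitable electronic circuitry it could run on very little power. It apparently has a connection to GANs as well.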

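So how does it learn without backpropagation? Like this:

1. Combine each data sample and its label into a single input.

2. Also prepare inputs that combine the data with a wrong label.

3. Feed the inputs from steps 1 and 2 through the NN.

4. Each unit learns to raise its "goodness" when correct data arrives with the correct label, and to lower its "goodness" when the data arrives with a wrong label.

5. At inference time, feed data through as in step 3 and solve the task by checking the units' "goodness".

A nice side effect: since no backward pass is needed, even non-differentiable activation functions can be dropped in. Lucky!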

from: Twitter
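Steps 1, 2 and 5 from the post above can be sketched in NumPy. This is only an illustration, not the author's code: writing the one-hot label over the first `num_classes` features of the input is an assumption (it mirrors the MNIST setup described in the paper), and the names `overlay_label`, `wrong_labels`, `predict` and `goodness_fn` are hypothetical.

```python
import numpy as np

def overlay_label(x, labels, num_classes):
    """Steps 1-2: write a one-hot label over the first num_classes features."""
    out = x.copy()
    out[:, :num_classes] = 0.0
    out[np.arange(len(x)), labels] = 1.0
    return out

def wrong_labels(labels, num_classes, rng):
    """Pick, for every sample, a label that is guaranteed to differ from the true one."""
    offsets = rng.integers(1, num_classes, size=len(labels))  # 1 .. num_classes-1
    return (labels + offsets) % num_classes

def predict(x, num_classes, goodness_fn):
    """Step 5: try every candidate label and keep the one with the highest goodness.

    goodness_fn is assumed to map a batch of inputs to one goodness score per row.
    """
    scores = np.stack(
        [goodness_fn(overlay_label(x, np.full(len(x), c), num_classes))
         for c in range(num_classes)],
        axis=1,
    )
    return scores.argmax(axis=1)

rng = np.random.default_rng(0)
x = rng.random((4, 8))          # toy data: 4 samples, 8 features
y = np.array([0, 2, 1, 2])      # true labels
positive = overlay_label(x, y, num_classes=3)
negative = overlay_label(x, wrong_labels(y, 3, rng), num_classes=3)
```

In a real network `goodness_fn` would run the forward pass and return the layers' goodness; here it is left as a parameter so the data preparation can be read on its own.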

Twitter post 2

This is a walkthrough of a Keras implementation of Prof. Hinton's talk at NeurIPS 2022 last month, "The Forward-Forward Algorithm for Training Deep Neural Networks".

(Prof. Hinton's talk from last month is here) https://neurips.cc/virtual/2022/invited-talk/55869

from: Twitter

abstract

The aim of this paper is to introduce a new learning procedure for neural networks and to demonstrate that it works well enough on a few small problems to be worth serious investigation.

The Forward-Forward algorithm replaces the forward and backward passes of backpropagation by two forward passes, one with positive (i.e. real) data and the other with negative data which could be generated by the network itself.

Each layer has its own objective function which is simply to have high goodness for positive data and low goodness for negative data.
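One way to turn "high goodness for positive data, low goodness for negative data" into a per-layer loss is to compare the goodness with a threshold through a logistic function, which is the formulation the paper uses. The sketch below assumes that form; the threshold value `theta=2.0` and the name `ff_layer_loss` are arbitrary choices for illustration.

```python
import numpy as np

def ff_layer_loss(goodness_pos, goodness_neg, theta=2.0):
    """Local loss for one layer.

    Small when positive goodness is above theta and negative goodness is below it.
    theta is a hypothetical threshold, not a value taken from the note.
    """
    # softplus(a) = log(1 + exp(a)), computed stably with logaddexp
    loss_pos = np.logaddexp(0.0, theta - goodness_pos)  # penalises low goodness on positive data
    loss_neg = np.logaddexp(0.0, goodness_neg - theta)  # penalises high goodness on negative data
    return float(np.mean(loss_pos) + np.mean(loss_neg))

# A layer that separates the two kinds of data gets a lower loss than one that confuses them
well_separated = ff_layer_loss(np.array([6.0, 7.0]), np.array([0.1, 0.2]))
confused = ff_layer_loss(np.array([0.1, 0.2]), np.array([6.0, 7.0]))
```

Because the loss depends only on this layer's own activities, its gradient never has to travel through other layers.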

The sum of the squared activities in a layer can be used as the goodness but there are many other possibilities, including minus the sum of the squared activities.
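As a rough sketch of how a single layer could be trained with the sum of squared activities as its goodness, the NumPy code below performs one positive and one negative forward pass per step and updates the weights from a purely local gradient. It is an illustration under stated assumptions, not the author's implementation: the class name `FFLayer`, the ReLU activation, the threshold `theta`, the learning rate, the input length normalisation and the toy data are all choices made here.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

class FFLayer:
    def __init__(self, n_in, n_out, rng, theta=2.0, lr=0.03):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
        self.b = np.zeros(n_out)
        self.theta, self.lr = theta, lr  # hypothetical threshold and learning rate

    def forward(self, x):
        # Normalise the input length so only its direction is passed on (an assumption here)
        xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)
        return xn, np.maximum(0.0, xn @ self.W + self.b)  # ReLU activities

    def goodness(self, x):
        # Goodness = sum of squared activities in this layer
        return np.sum(self.forward(x)[1] ** 2, axis=1)

    def train_step(self, x_pos, x_neg):
        # Two forward passes: sign=+1 pushes goodness up, sign=-1 pushes it down
        for x, sign in ((x_pos, +1.0), (x_neg, -1.0)):
            xn, h = self.forward(x)
            g = np.sum(h ** 2, axis=1)
            # Loss = log(1 + exp(-sign * (g - theta))); this is its derivative w.r.t. g
            dg = -sign * sigmoid(-sign * (g - self.theta))
            dz = dg[:, None] * 2.0 * h  # h is already zero wherever the ReLU is inactive
            self.W -= self.lr * xn.T @ dz / len(x)
            self.b -= self.lr * dz.mean(axis=0)

# Toy data: positive and negative samples point in different directions
rng = np.random.default_rng(0)
x_pos = 5.0 * np.eye(10)[0] + 0.5 * rng.normal(size=(64, 10))
x_neg = 5.0 * np.eye(10)[1] + 0.5 * rng.normal(size=(64, 10))
layer = FFLayer(10, 16, rng)
for _ in range(300):
    layer.train_step(x_pos, x_neg)
```

After training, `layer.goodness(x_pos)` should on average be higher than `layer.goodness(x_neg)`. Using minus the sum of squares as goodness would amount to flipping `sign` for the two passes.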

If the positive and negative passes can be separated in time, the negative passes can be done offline, which makes the learning much simpler in the positive pass and allows video to be pipelined through the network without ever storing activities or stopping to propagate derivatives.

2024/6/6 13:34

original: /tomiokario-close/The Forward-Forward Algorithm: Some Preliminary Investigations